DETAILED ACTION



1. This action is responsive to communications: Amendment, filed on 12/7/2007.
This action is non-final. 

2. Claims 1, 2, 6, 8-15, 18 and 20-27 are pending in this application. Claims 1, 13, 20
and 24 are independent claims. In the Amendment, filed on 12/7/2007, claims 1, 2, 8, 
13, 20 and 24 were amended, claims 3-5, 7, 16, 17 and 19 were canceled, and claims 
26 and 27 were added. 

3. The present title of the invention is "System and method for synchronizing 
samples in a programmable graphics processing unit". 

Claim Rejections - 35 USC §102 

4. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

5. Claims 13-15 and 18 are rejected under 35 U.S.C. 102(e) as being anticipated by 
Lindholm et al. (US 7,015,913).

The applied reference has a common assignee with the instant application. 
Based upon the earlier effective U.S. filing date of the reference, it constitutes prior art 
under 35 U.S.C. 102(e). This rejection under 35 U.S.C. 102(e) might be overcome 
either by a showing under 37 CFR 1.132 that any invention disclosed but not claimed in
the reference was derived from the inventor of this application and is thus not the
invention "by another," or by an appropriate showing under 37 CFR 1.131.

6. As per claim 13, Lindholm et al., hereinafter Lindholm, discloses a method for
processing divergent graphics samples in a programmable graphics processing unit, the 
method comprising: 

processing samples of a group of samples in non-divergent mode ("FIG 6 ... 
Instruction Scheduler 430 to schedule the execution of program instructions to process 
several samples", column 13, line 60-42); 

determining whether each program counter of a plurality of program counters is the 
same, each program counter of the plurality of program counters corresponding to a 
different one of the samples of the group of samples ("In one embodiment, instructions 
with equal program counter are considered synchronized", column 14, line 2-3); 

determining whether each subroutine depth of a plurality of subroutine depths is the 
same, each subroutine depth of the plurality of subroutine depths corresponding to a 
different one of the samples of the group of samples ("In another embodiment, in 
addition to program counters, thread state data such as stack depths, nesting levels, 
subroutine calls, or the like are used to determine two or more threads are 
synchronized", column 14, line 3-7); 

processing each of the other samples in the group of samples in non-divergent 
mode, after processing the one or more divergent samples ("Instruction Scheduler 430 
determines if a synchronization mode is enabled. If in step 620 Instruction Scheduler 
430 determines a synchronization mode is enabled, in step 625 Instruction Scheduler
430 checks for synchronization and proceeds to step 630", column 13, line 65- column 
14, line 2, where prior to the step, divergent samples were processed).

7. As per claim 14, Lindholm demonstrated all the elements as disclosed in the
rejected claim 13, and further discloses the step of processing one or more divergent 
samples through a remainder of a program if a first program counter of the plurality of 
program counters is different than a second program counter of the plurality of program 
counters ("In step 545 Execution Unit 470 also updates the program counter associated 
with the thread when a branch or loop instruction is executed and the program counter 
is different than the program counter updated in step 540. In step 547 Execution Unit 
470 determines there are no more instructions in the thread, and, if so, return to step 
535", column 13, line 6-11). 
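
The comparison recited for claim 13 and the divergence condition of claim 14 can be illustrated with a minimal sketch. It is not taken from Lindholm or from the application; the names SampleThread, is_synchronized and split_divergent are hypothetical, and treating the first sample as the reference point is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SampleThread:
    """Hypothetical per-sample state: one program counter, one subroutine depth."""
    program_counter: int
    subroutine_depth: int


def is_synchronized(group):
    """True when all samples share one program counter and one subroutine depth."""
    pcs = {s.program_counter for s in group}
    depths = {s.subroutine_depth for s in group}
    return len(pcs) <= 1 and len(depths) <= 1


def split_divergent(group):
    """Partition a group into (divergent, others).

    Assumption of this sketch: the first sample defines the reference state,
    and any sample whose (program counter, depth) pair differs is divergent.
    """
    if not group:
        return [], []
    ref = (group[0].program_counter, group[0].subroutine_depth)
    divergent = [s for s in group
                 if (s.program_counter, s.subroutine_depth) != ref]
    others = [s for s in group
              if (s.program_counter, s.subroutine_depth) == ref]
    return divergent, others
```

Under this sketch, the divergent samples would be run through the remainder of the program first, and the remaining samples would then continue in non-divergent mode.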

8. As per claim 15, Lindholm demonstrated all the elements as disclosed in the 
rejected claim 14, and further discloses the first program counter being different than 
the second program counter results from a conditional branch or a jump (column 13,
line 7-8). 

9. As per claim 18, Lindholm demonstrated all the elements as disclosed in the 
rejected claim 13, and further discloses the first subroutine depth being different than 
the second subroutine depth relates to a call-return ("in addition to program counters, 
thread state data such as stack depths, nesting levels, subroutine calls, or the like are 
used to determine thread age", column 9, line 11-14, where the subroutine call 
represents a call-return). 

Claim Rejections - 35 USC § 103 

10. Claims 20 and 23 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Rishi et al. (US 5,953,530) and further in view of Miller et al. (US 2004/0068730).

As per claim 20, Rishi et al., hereinafter Rishi, discloses a system for
synchronizing divergent graphics samples in a programmable graphics processing unit, 
the system comprising: 

a plurality of processing threads, each processing thread corresponding to a 
different sample of a group of samples and configured to contain a program counter, a 
subroutine depth and state data ("FIG. 4 depicts a representation multi-processor 
machine configuration which would be typical for use with a multi-threaded target 
program", column 10, line 49-51); and 

a plurality of stacks, each stack corresponding to a different sample of the group of 
samples and configured to store state data in one or more stack levels ("A thread has a 
program counter (PC) and a stack to keep track of local variables and return 
addresses", column 1, line 45-47).

Rishi discloses a method of synchronizing divergent graphics samples. It is noted
that Rishi does not explicitly disclose "wherein a first portion of each stack resides in a 
dedicated local storage resource and a second portion of each stack resides in local 
memory". However, this is known in the art as taught by Miller et al., hereinafter Miller. 
Miller discloses a method of affinitizing threads in which "each of the threads 2101 to 
260k maintains its own local variables and local resources such as program counter. 
They also share common global variables and memory", [0041]. 

Thus, it would have been obvious to incorporate the teaching of Miller into Rishi 
because Rishi discloses a method of synchronizing divergent graphics samples and 
Miller discloses that stack memory could be stored in global and local memory in order to
provide flexibility in sharing data. 
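
The quoted limitation, a stack whose first portion resides in a dedicated local storage resource and whose second portion resides in local memory, can be pictured with the following minimal sketch. It is not drawn from Rishi, Miller or the application; the class SplitStack, the parameter fast_levels and the choice to fill the dedicated portion first are all assumptions made for illustration.

```python
class SplitStack:
    """Hypothetical stack split across two storage areas."""

    def __init__(self, fast_levels=4):
        self.fast_levels = fast_levels  # assumed capacity of the dedicated storage
        self.fast = []    # stands in for the dedicated local storage resource
        self.spill = []   # stands in for local memory

    def push(self, state):
        # Fill the dedicated portion first, then overflow into local memory.
        if len(self.fast) < self.fast_levels:
            self.fast.append(state)
        else:
            self.spill.append(state)

    def pop(self):
        # The most recent entry lives in the spill area whenever it is non-empty.
        if self.spill:
            return self.spill.pop()
        return self.fast.pop()

    def depth(self):
        """Number of stack levels currently holding state data."""
        return len(self.fast) + len(self.spill)
```

Because the spill area is only used once the dedicated portion is full, last-in-first-out order is preserved across the two portions.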

11. Claim 23 contains limitations included in claim 20 and is therefore rejected for
reasons similar to those given for claim 20.

12. Claims 21-22 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Rishi et al. and Miller et al. as applied to claim 20 above, and further in view of 
Cosgrove et al. (4,399,507). 

As per claim 21, Rishi and Miller demonstrated all the elements as disclosed in
the rejected claim 20. 

Rishi and Miller disclose a method of synchronizing divergent graphics samples. 
It is noted Rishi and Miller do not explicitly disclose wherein the subroutine depth of a 
first sample is equal to the number of the one or more stack levels of a first stack that 
contain state data, the first stack corresponding to the first sample, however, this is 
known in the art as taught by Cosgrove et al., hereinafter Cosgrove. Cosgrove discloses 
an instruction-pipelined processor in which "64 level stack 10 which is addressed with a 
6-bit stack Pointer (SP) 28 allows nesting up to 64 levels of Subroutine and Interrupt 
routines" (column 11, line 59-61).
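
The relationship at issue, a subroutine depth equal to the number of stack levels that hold state data, can be shown with a minimal sketch of a 64-level return stack. This is not Cosgrove's circuit; the class ReturnStack and its methods are hypothetical, and the sketch keeps a simple count of occupied levels rather than modelling the 6-bit stack pointer register itself.

```python
class ReturnStack:
    """Hypothetical 64-level stack tracking nested subroutine calls."""

    LEVELS = 64  # matches the 64 levels mentioned in the quoted passage

    def __init__(self):
        self.slots = [None] * self.LEVELS
        self.sp = 0  # count of occupied levels (an assumption of this sketch)

    def call(self, return_address):
        """Enter a subroutine: store the return address in the next level."""
        if self.sp == self.LEVELS:
            raise OverflowError("nesting deeper than 64 levels")
        self.slots[self.sp] = return_address
        self.sp += 1

    def ret(self):
        """Leave a subroutine: release the top level and return its address."""
        if self.sp == 0:
            raise IndexError("return without a matching call")
        self.sp -= 1
        return self.slots[self.sp]

    @property
    def subroutine_depth(self):
        """Depth equals the number of levels currently holding state data."""
        return self.sp
```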

Thus, it would have been obvious to one of ordinary skill in the art to incorporate 
the teaching of Cosgrove into Rishi and Miller because Rishi and Miller disclose a 
method of synchronizing divergent graphics samples and Cosgrove discloses that
subroutine instructions can be tracked with a leveled stack in order to track the routines.

13. As per claim 22, Rishi and Miller demonstrated all the elements as disclosed in
the rejected claim 20. 

Rishi and Miller disclose a method of synchronizing divergent graphics samples. 
It is noted Rishi and Miller do not explicitly disclose wherein each stack resides in a 
dedicated local storage resource, however, this is known in the art as taught by 
Cosgrove. Cosgrove discloses an instruction-pipelined processor in which the stack is
stored locally (Figure 5, item 10).

Thus, it would have been obvious to one of ordinary skill in the art to incorporate 
the teaching of Cosgrove into Rishi and Miller because Rishi and Miller disclose a 
method of synchronizing divergent graphics samples and Cosgrove discloses that a locally
stored stack could be used to track subroutines in order to conveniently track the routines.

Specification 

14. Claim 23 is objected to under 37 CFR 1.75(c), as being of improper dependent 
form for failing to further limit the subject matter of a previous claim. Applicant is 
required to cancel the claim(s), or amend the claim(s) to place the claim(s) in proper 
dependent form, or rewrite the claim(s) in independent form. The limitation of claim 23
does not further limit claim 20.

Allowable Subject Matter 

15. Claims 1, 2, 6, 8-12 and 24-27 are allowed.

The following is a statement of reasons for the indication of allowable subject 
matter:

As per claims 1 and 24, the closest prior art by Lindholm et al. (7,015,913),
Puzak or Kishi do not explicitly disclose 

synchronizing the subset of samples with the other samples of the group for 
processing a next instruction in the instruction sequence only if all the samples of the 
subset have encountered the first synch token. 
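
The condition in the quoted limitation, resynchronizing only once every sample of the subset has encountered the synch token, can be pictured with a minimal sketch. It does not reproduce the claimed system; the SYNCH marker, the list-of-instructions program model and both function names are hypothetical.

```python
SYNCH = "SYNCH"  # hypothetical marker standing in for a synch token


def advance_to_token(program, pc):
    """Step one sample forward until it sits on a synch token or the program ends."""
    while pc < len(program) and program[pc] != SYNCH:
        pc += 1
    return pc


def can_resynchronize(program, subset_pcs):
    """True only if every sample of the subset is positioned at a synch token."""
    return all(pc < len(program) and program[pc] == SYNCH for pc in subset_pcs)
```

In this sketch a subset with even one sample still short of the token is not rejoined with the rest of the group.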

As per claims 13-15, 18 and 20-23, upon further consideration of existing and 
newly found prior art, the claims are rejected as stated above. 

Conclusion 

16. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Inquiries 

17. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Ryan R. Yang whose telephone number is (571) 272-7666.
The examiner can normally be reached on M-F 8:30AM-5:00PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Michael Razavi, can be reached on (571) 272-7664. The fax phone number
for the organization where this application or proceeding is assigned is (571) 273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

/Ryan R Yang/ 

Primary Examiner, Art Unit 2628 
March 14, 2008 



